DISCUSSION


During this reporting period, very substantial progress was made 
in this investigation. To give some perspective to this progress, the 
status of the investigation at the end of March, 1973 will be briefly 
summarized. 

Status of Investigation at the End of March, 1973 

ERTS-1 image 1049-17324, which covers an area in southern Arizona
around Phoenix, had been analyzed. Initial attempts to process the
digital data of this image through our Pattern and Terrain Classification
Software (PTCS) System had failed. Detailed analysis of the data then
showed the presence of noise, which was responsible for the failure of
the algorithms to reliably recognize the various terrain types. Once
the noise was discovered, it was quickly traced to small calibration
errors of the MSS detectors (there are six detectors per spectral band).
We filtered the noise in the Fourier domain and proceeded to develop new
signatures for the various terrain types. Finally, the data was subjected
to automatic terrain recognition by the modified PTCS software.
The recognition results for most terrain types were very good: 97% for
desert, 89% for farms, 80% for mountains, 74% for urban areas, 98% for
clouds, 100% for water, and 81% for cloud shadows. Only river flood plains,
which are peculiar geographic features of southern Arizona, were recognized
poorly (11%). The accuracy of recognition was determined by comparison
to existing maps and to high-altitude (U-2 missions in September and
December, 1971) and low-altitude (mission 212 of the Earth Resources
Aircraft Program, from the Manned Spacecraft Center in Houston, Texas)
aerial photography. The accuracy of the cloud recognition was determined
by photointerpretation of the ERTS image, since the aerial photography
was obtained on different dates.
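
The report does not describe the Fourier-domain filter itself. The
following is a minimal sketch of one way such a filter could work, assuming
that the detector calibration errors appear as an additive brightness
pattern that repeats every six scan lines. The function names, the use of
per-line mean brightness, and the test data are all hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (adequate for short line profiles)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def inverse_dft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def remove_striping(line_means, period=6):
    """Zero the Fourier bins that carry a pattern repeating every `period` lines."""
    n = len(line_means)
    if n % period:
        raise ValueError("profile length must be a multiple of the striping period")
    spectrum = dft(line_means)
    step = n // period
    # Bins step, 2*step, ... hold the striping harmonics; bin 0 (the mean) is kept.
    for k in range(step, n, step):
        spectrum[k] = 0.0
    return inverse_dft(spectrum)


# Hypothetical profile: a slowly varying scene plus a six-line detector pattern.
n = 36
scene = [50.0 + 10.0 * math.cos(2 * math.pi * t / n) for t in range(n)]
detector_offsets = [0.0, 1.5, -1.0, 0.5, -0.5, -0.5]  # assumed values, zero mean
striped = [scene[t] + detector_offsets[t % 6] for t in range(n)]
cleaned = remove_striping(striped)
```

In this toy case the scene varies slowly and shares no Fourier bins with
the striping, so the filter recovers it almost exactly. Real scene content
at the striping frequencies would be removed along with the noise, and any
offset common to all six detectors would remain in the mean.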

The automated recognition of terrain types was described in a paper
given at the ERTS-1 Symposium of March 5-9, 1973, sponsored by NASA/
GSFC. The title of this paper is "Terrain Type Recognition Using ERTS-1
MSS Images". The key to the recognition process described above is a
heuristically determined algorithm which utilizes the newly developed
spatial features as well as the brightness information of the MSS-5 band.

Selected Course of the Investigation 

At the end of March, the status of the investigation was thoroughly 
reviewed with the scientific monitor, Mr. William Alford, and the follow- 
ing course was selected: 
1. It was decided to combine the multispectral data with the
spatial features in the recognition algorithm. The earth
resources information present in the ERTS images is conveyed
individually by the picture elements through their brightness
values in the four spectral bands, and collectively by the
distribution of picture elements over the image. Both types
of information (spectral and spatial) must be exploited.

2. The heuristic algorithm produced excellent results but
took a lot of time to develop. Since the purpose of
automatic processing is to compete economically with
manual photointerpretation, it appeared that the heuristic
algorithm was not cost effective when employed with
a general purpose computer. It was decided, therefore,
to employ the maximum likelihood criterion algorithm instead.
This algorithm has been in wide use by many investigators
for analyzing multispectral and/or spatial
feature data. The results have not always been good.
We suspected that this algorithm produces errors when
the various resource classes to be recognized do not have
Gaussian statistics. It was decided, therefore, that the
statistics of the classes be analyzed and that nonlinear
transformations of the data be developed which would
render this algorithm's results more accurate.

3. It was also decided to introduce "clustering" techniques,
which are necessary to avoid requiring training
data for each ERTS image. The criterion to be used
for clustering is also dependent on Gaussian statistics,
so, as a first step, it is necessary to show that the
criterion behaves properly.

Status of the Investigation at the End of May, 1973 

1. Preprocessing of ERTS-1 Images 

The preprocessing operations have been completed for the 
following ERTS-1 images: 

1031-17325 from Phoenix, Arizona 
1049-17324 from Phoenix, Arizona 
1103-17332 from Phoenix, Arizona

1040-18201 from Cascade Mountains
1077-18260 from Cascade Mountains

1015-17415 from Salt Lake, Utah



We also tried to preprocess image 1069-17420 but found that it was
an 800 BPI tape mislabeled as 556 BPI. The preprocessing operations
involve taking the data from the NASA-delivered bulk
CCT's, combining the data over the area of interest from two
CCT's, separating the spectral bands, resampling each band
to achieve equal pixel spacing on the ground, filtering the
noise and other artifacts such as missing lines, and recording
the individual images on the laser beam recorder to verify
that each band has been properly preprocessed and that no noise
or other problems remain in the data.
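
The resampling step can be illustrated with a short sketch. The report
does not state which interpolation method or pixel spacings were used, so
linear interpolation, the function name, and the spacings in the example
are assumptions made purely for illustration.

```python
def resample_line(values, src_spacing, dst_spacing):
    """Linearly interpolate samples spaced src_spacing apart onto a dst_spacing grid."""
    if len(values) < 2:
        raise ValueError("need at least two samples to interpolate")
    length = (len(values) - 1) * src_spacing
    # Number of output samples that fit within the covered ground distance.
    count = int(length / dst_spacing + 1e-9) + 1
    resampled = []
    for k in range(count):
        pos = k * dst_spacing / src_spacing      # position in units of input samples
        i = min(int(pos), len(values) - 2)       # left neighbour, clamped at the end
        frac = pos - i
        resampled.append(values[i] * (1 - frac) + values[i + 1] * frac)
    return resampled


# Hypothetical scan line sampled every 2 units, resampled to every 3 units.
line = [0, 10, 20, 30]
equal_spacing = resample_line(line, src_spacing=2.0, dst_spacing=3.0)
```

Applying the same function along rows and along columns with a common
target spacing would give square ground cells, which is what the spatial
features computed over 32 x 32 element cells implicitly rely on.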

Six spatial features were combined with three spectral bands 
(MSS 4, 5 and 7) to produce a 9-dimensional vector for each 
image cell (32 x 32 picture elements). The maximum likeli- 
hood criterion was employed. We found that the covariance 
matrix for mountains had a singularity and mountains could 
not be recognized. Two of the spatial features were used 
for recognizing clouds and water only. At this point we 
decided to remove these two features and we have been using 
a seven-dimensional vector ever since. Clouds and water 
are still recognizable but not by the maximum likelihood 
criterion. 

Using the seven-dimensional vectors and image 1049-17324
as training data, the statistics (covariance matrices) for
desert, farm, mountains and city were computed. Then,
using these statistics, the same input data and the maximum
likelihood criterion, the cells were reclassified into one
of the four categories. The recognition accuracy was 54%
for desert, 92% for farms, 97% for mountains, and 92% for
cities. The classification accuracy for farms, mountains
and cities improved over the results obtained in March
using the spatial information only and the heuristic
algorithm. Desert was poorly recognized (many desert cells
were assigned to the mountain category), and we suspected
that the statistics for the classes were not Gaussian.
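
The maximum likelihood criterion can be sketched as follows, assuming
Gaussian classes with equal prior probabilities. This is an illustration
and not the PTCS implementation: the function names are hypothetical, and
the two-dimensional training vectors are made-up stand-ins for the
seven-dimensional cell vectors. The singularity check mirrors the problem
reported for the mountain covariance matrix.

```python
import math


def mean_vector(samples):
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def covariance_matrix(samples, mean):
    n, d = len(samples), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (n - 1) for c in row] for row in cov]


def solve_and_logdet(matrix, rhs):
    """Solve matrix * y = rhs by Gaussian elimination; also return log|det(matrix)|."""
    d = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    logdet = 0.0
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        a[col], a[pivot] = a[pivot], a[col]
        logdet += math.log(abs(a[col][col]))
        for r in range(col + 1, d):
            factor = a[r][col] / a[col][col]
            for c in range(col, d + 1):
                a[r][c] -= factor * a[col][c]
    y = [0.0] * d
    for i in reversed(range(d)):
        y[i] = (a[i][d] - sum(a[i][j] * y[j] for j in range(i + 1, d))) / a[i][i]
    return y, logdet


def log_likelihood(x, mean, cov):
    """Log of the Gaussian density with the given mean and covariance at x."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y, logdet = solve_and_logdet(cov, diff)
    mahalanobis = sum(di * yi for di, yi in zip(diff, y))
    return -0.5 * (mahalanobis + logdet + len(x) * math.log(2 * math.pi))


def train(training):
    """Compute (mean, covariance) for every labelled class."""
    stats = {}
    for label, samples in training.items():
        mean = mean_vector(samples)
        stats[label] = (mean, covariance_matrix(samples, mean))
    return stats


def classify(x, stats):
    """Assign x to the class with the largest Gaussian log-likelihood."""
    return max(stats, key=lambda label: log_likelihood(x, *stats[label]))


# Hypothetical two-component training vectors for two classes.
training = {
    "desert": [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)],
    "farm": [(5.0, 6.0), (5.2, 6.3), (4.8, 5.9), (5.1, 6.1)],
}
stats = train(training)
label = classify((1.0, 2.1), stats)
```

Because the criterion evaluates a Gaussian density for each class, its
decisions are only as good as the Gaussian assumption, which is the
concern raised about the desert results.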

Analysis of histograms of the classes showed that no component
of any class was even approximately Gaussian. The
distribution of each component (spectral band or spatial
feature) resembled the Rayleigh rather than the Gaussian.
All components were positive with small means and large
standard deviations.



To make the distributions approximately Gaussian, various nonlinear
transformations were applied to all seven vector components.
We found that various logarithms worked fairly well,
but the resultant distributions are sensitive to amplitude
variations in the data. In other words, if the logarithm to
a certain base is used for one component (for example, the
MSS 7 band), the distribution for each class may or may not
become approximately Gaussian, depending on the solar illumination,
which changes the range of values obtained in this band.
Finally, the nonlinear transformations we have selected are
all powers less than 1. For image 1049-17324 the powers
for the various components range from 0.8 to 0.025 and have
been optimized to produce excellent recognition results in
all four classes: desert 89%, farms 97%, mountains 96%, and
city 95%. In arriving at an optimum transformation for each
component, Gaussian distributions cannot be achieved for all
classes. In other words, as one adjusts the transformation
of, say, the MSS 5 band, the desert distribution may become
more Gaussian while the city distribution may become less
Gaussian. There is no reason to believe that all distributions
can always be made Gaussian in all components by appropriate
transformations. To reduce the possibility of
such problems and maintain high recognition rates, the number
of different classes should be kept small.
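
The effect of a power transformation on a Rayleigh-like component can be
sketched as follows. The report optimized the powers against recognition
results; this sketch substitutes a simpler stand-in criterion, the sample
skewness, which is zero for a symmetric distribution. The simulated data,
the candidate list, and the function names are hypothetical.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def power_transform(values, exponent):
    """Raise every (non-negative) value to the given power."""
    return [v ** exponent for v in values]


def best_exponent(values, candidates):
    """Pick the candidate power whose transformed data is least skewed."""
    return min(candidates, key=lambda p: abs(skewness(power_transform(values, p))))


# Simulated Rayleigh-distributed component (inverse-CDF sampling, fixed seed).
rng = random.Random(0)
sigma = 4.0
samples = [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(5000)]

candidates = [1.0, 0.8, 0.6, 0.4, 0.2, 0.025]  # 1.0 means "no transformation"
chosen = best_exponent(samples, candidates)
raw_skew = skewness(samples)
new_skew = skewness(power_transform(samples, chosen))
```

A power transformation changes the shape of the distribution in the same
way regardless of a constant scale factor on the data, which is one
plausible reason it is less sensitive to illumination changes than a
logarithm with a fixed offset or range. Skewness alone does not guarantee
a Gaussian shape, so the recognition rates remain the final test.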

Also, the importance of each component in the recognition
of each class is not the same. A component may be essential
for recognizing one class and relatively unimportant for
recognizing other classes. This knowledge can be exploited
when selecting a transformation for a component.
The transformation is adjusted so that an approximately
Gaussian distribution is obtained for the class for which the
component is most important. The effectiveness of each
transformation can be judged from the recognition rates
achieved. (See above.)

There has been considerable work with the "clustering algorithms".
In particular, the divergence criterion between
classes was examined. This criterion allows a determination
of the statistical separability of any two Gaussian classes.
If the divergence is greater than 10, then the probability
that a vector belonging to one of two classes can be correctly
identified is greater than 80%. The divergence is
also dependent on Gaussian statistics. Before the nonlinear
transformations of the vectors (when the classes were highly
non-Gaussian) the divergence values varied between 400 and
6 x 10® and seemed to bear no relationship to the recognition 
rates achieved. After the nonlinear transformations were 
completed rendering the classes approximately Gaussian, the 
divergence values took on a more reasonable range (20 - 100) 
and generally speaking the higher divergence values were 
associated with pairs of classes between which very few er- 
rors were made. 
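
The report does not give the divergence formula. The sketch below uses
the standard divergence between two Gaussian classes (the sum of the two
directed Kullback-Leibler measures), on the assumption that this is the
criterion meant. The function names and example values are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    d = len(matrix)
    a = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(d)]
         for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(d):
            if r != col:
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [row[d:] for row in a]


def divergence(mean1, cov1, mean2, cov2):
    """Divergence between two Gaussian classes.

    0.5 * tr[(C1 - C2)(C2^-1 - C1^-1)] + 0.5 * (m1 - m2)^T (C1^-1 + C2^-1) (m1 - m2)
    """
    d = len(mean1)
    inv1, inv2 = invert(cov1), invert(cov2)
    # tr(AB) = sum over i, j of A[i][j] * B[j][i]
    cov_term = sum((cov1[i][j] - cov2[i][j]) * (inv2[j][i] - inv1[j][i])
                   for i in range(d) for j in range(d))
    diff = [m1 - m2 for m1, m2 in zip(mean1, mean2)]
    mean_term = sum(diff[i] * (inv1[i][j] + inv2[i][j]) * diff[j]
                    for i in range(d) for j in range(d))
    return 0.5 * cov_term + 0.5 * mean_term


# Hypothetical classes: equal covariances, means separated by 5 standard deviations.
identity = [[1.0, 0.0], [0.0, 1.0]]
well_separated = divergence([0.0, 0.0], identity, [3.0, 4.0], identity)
identical = divergence([1.0, 2.0], identity, [1.0, 2.0], identity)
```

With equal covariances the divergence reduces to the squared Mahalanobis
distance between the class means, so larger values correspond to classes
that are easier to tell apart. When the classes are far from Gaussian, the
sample means and covariances no longer summarize the distributions well,
and the computed divergence need not track the recognition rates.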

For the next reporting period, we hope to complete the following
tasks:

1. Complete all preprocessing operations for three more images 

2. Complete clustering algorithm software development 

3. Process eight images through the clustering and recognition 
algorithms. 



